Abstract

Modality is one of the important components of grammar in linguistics. It
lets the speaker express an attitude towards a state of affairs, or assess
its potentiality. It conveys different senses and is therefore perceived
differently depending on the context. This paper presents an account of the
gap in the functionality of current state-of-the-art Natural Language
Processing (NLP) systems. The contextual nature of linguistic modality is
studied, and the works and logical approaches employed by NLP systems
dealing with modality are reviewed. The paper views human cognition and
intelligence as a multi-layered approach that can be implemented by
intelligent systems for learning. Lastly, the current directions of research
in this field are discussed, together with an outlook on future work.
1. Introduction

Research in NLP is gaining interest as its applications become more
significant. Natural language is highly ambiguous, and understanding it
effectively can be considered a primary concern. Modality in language is
associated with contextual understanding and implied perceptions. Defining
modality from a computational linguistics perspective is somewhat difficult
because several concepts are used to refer to phenomena that are related to
modality, depending on the task at hand and the specific singularities that
the speaker addresses. Modality can articulate different senses, and
understanding the precise sense conveyed is important.

The major tasks involved in dealing with modality in text are: detecting
the occurrence of modality, categorizing the type of modality, and
perceiving the sense conveyed through it. Parsers and language processing
tools identify the occurrence of modals through Part of Speech (POS)
tagging. As for the type of modality, no standard classification is
available in the literature, since various types have been defined. We
consider a broad classification into two types: epistemic and deontic.
According to Palmer (2014), epistemic modality is used by speakers to
express their judgment about the factual status of the proposition, whereas
deontic modality is concerned with the speaker's directive attitude towards
an action to be carried out. Deontic modality relates to obligation or
permission and to conditional factors that are external to the relevant
individual. The epistemic type covers the senses of necessity and
possibility, while the deontic type covers those of permission and
obligation. Modality type categorization and sense recognition have been
carried out with annotation approaches.

This paper closely examines the methods applied to the above-mentioned
tasks and reviews the works in the field. It then highlights the
shortcomings of state-of-the-art natural language systems with respect to
linguistic modality, which form the gap in their functionality. The later
sections cover directions for improvement in the same context.

2. Context Review 

As tagging and annotating are the two key aspects, let us summarize them
with regard to language processing. Tagging assigns to each word in the
corpus a tag corresponding to the part of speech that it embodies. Part of
Speech tagging is an essential subtask in language processing and is very
useful for subsequent phases such as syntactic parsing. The tags into which
a token can be classified depend on the tagset adopted for the task, that
is, on the treebank in use, which defines the inventory of tags.

Annotation, on the other hand, uses a machine learning approach: it learns
from a pre-annotated corpus in order to annotate plain text. The
annotations are markups that usually represent extended features, which may
include polarity, certainty, subjectivity, sense and type of modality,
event mentions, etc. POS tagging is also carried out with machine learning
methods in a similar way to annotation (Manning and Schütze, 1999;
Ratnaparkhi et al, 1996; Toutanova and Manning, 2000).

2.1. Detecting occurrence 

A number of different expressions in language can have modal meanings.
Von Fintel (2006) discusses a subset of the variety of modal expressions.
Some points are highlighted below using the tagging output of the Stanford
NLP parser for a few sample sentences with different expressions.

When modal auxiliaries are used, the parser tags them as MD, i.e. modal.

1. Sandy must be home. 

Sandy/NNP must/MD be/VB home/NN ./. 

Thus any explicit occurrence of a modal auxiliary such as must,
should/shall, might, may or could/can is detected reliably.

Semi-modals such as has to, need to and ought to are treated differently.
Ought to is sometimes considered a modal and sometimes a semi-modal because
of differences in its syntax; in either case, the parser tags it as a
modal. The remaining semi-modals are not tagged as modals, but parsers can
still identify the auxiliary verbs and the structure of the sentence.


2. Sandy has to be home.

Sandy/NNP has/VBZ to/TO be/VB home/NN ./.

3. Sandy ought to be home.

Sandy/NNP ought/MD to/TO be/VB home/NN ./.
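
The tagged outputs for must, has to and ought to can be checked
mechanically. The following is a minimal sketch, not the parser's own
logic: it reads word/TAG strings of the kind shown above, reports every
token tagged MD, and additionally flags a small, illustrative set of
semi-modal heads when they are followed by a TO token. The names
parse_tagged, find_modals and SEMI_MODAL_HEADS are assumptions introduced
here for illustration.

```python
# Illustrative list of semi-modal heads; an assumption, not a standard resource.
SEMI_MODAL_HEADS = {"has", "have", "had", "need", "needs", "needed"}


def parse_tagged(line):
    """Split a 'word/TAG word/TAG ...' string into (word, tag) pairs."""
    pairs = []
    for token in line.split():
        # rpartition splits on the last slash, so './.' becomes ('.', '.').
        word, _, tag = token.rpartition("/")
        pairs.append((word, tag))
    return pairs


def find_modals(pairs):
    """Return (index, expression, kind) for modals and simple semi-modals."""
    hits = []
    for i, (word, tag) in enumerate(pairs):
        if tag == "MD":
            # Explicit modal auxiliary, as tagged by the parser.
            hits.append((i, word, "modal"))
        elif (word.lower() in SEMI_MODAL_HEADS
              and i + 1 < len(pairs)
              and pairs[i + 1][1] == "TO"):
            # Not tagged MD, but matches the 'head + to' surface pattern.
            hits.append((i, word + " to", "semi-modal"))
    return hits


if __name__ == "__main__":
    for sentence in (
        "Sandy/NNP must/MD be/VB home/NN ./.",
        "Sandy/NNP has/VBZ to/TO be/VB home/NN ./.",
        "Sandy/NNP ought/MD to/TO be/VB home/NN ./.",
    ):
        print(find_modals(parse_tagged(sentence)))
```

A surface pattern like this only locates candidate expressions; it says
nothing about the type or sense of the modality.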


Apart from modal auxiliaries and semi-modals, modal meaning can also be
conveyed by adverbs (perhaps, probably, etc.), nouns (possibility,
necessity, etc.), adjectives (bound, certain, etc.), and conditional
constructs (if ... then ...).

4. It is far from necessary that Sandy is home.

It/PRP is/VBZ far/RB from/IN necessary/JJ that/IN
Sandy/NNP is/VBZ home/NN ./.

5. There is a slight possibility that Sandy is home. 

There/EX is/VBZ a/DT slight/JJ possibility/NN that/IN 
Sandy/NNP is/VBZ home/NN ./. 

6. Perhaps Sandy is home. 

Perhaps/RB Sandy/NNP is/VBZ home/NN ./. 

7. If the light is on, Sandy is home. 

If/IN the/DT light/NN is/VBZ on/IN ,/, Sandy/NNP
is/VBZ home/NN ./.


Although advances in the calibration of parsers have improved the ability
to tag words accurately, beyond a certain point the mechanism seems to
become insufficient for gathering underlying information that is not
superficial or apparent.
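
To make the limitation concrete, the sketch below shows what a purely
lexical pass over the tagged output could add: a lookup of (word, tag)
pairs in a small trigger lexicon. The lexicon entries are taken from the
words mentioned in this section; the name MODAL_TRIGGERS and the function
lexical_triggers are hypothetical, and a real system would need a far
larger lexicon and would still not recover type or sense.

```python
# Small illustrative lexicon keyed by (lowercased word, POS tag).
# Entries come from the examples in this section; the structure is an assumption.
MODAL_TRIGGERS = {
    ("perhaps", "RB"): "adverb",
    ("probably", "RB"): "adverb",
    ("possibility", "NN"): "noun",
    ("necessity", "NN"): "noun",
    ("necessary", "JJ"): "adjective",
    ("certain", "JJ"): "adjective",
    ("bound", "JJ"): "adjective",
    ("if", "IN"): "conditional",
}


def lexical_triggers(tagged_line):
    """Return (word, kind) for every token found in the trigger lexicon."""
    hits = []
    for token in tagged_line.split():
        word, _, tag = token.rpartition("/")
        kind = MODAL_TRIGGERS.get((word.lower(), tag))
        if kind is not None:
            hits.append((word, kind))
    return hits


if __name__ == "__main__":
    print(lexical_triggers("Perhaps/RB Sandy/NNP is/VBZ home/NN ./."))
    print(lexical_triggers(
        "There/EX is/VBZ a/DT slight/JJ possibility/NN that/IN "
        "Sandy/NNP is/VBZ home/NN ./."
    ))
```

The lookup treats every listed word as modal regardless of context, which
is exactly the kind of superficial evidence the paragraph above describes.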

2.2. Type Categorization and Sense Perception 

Categorizing the type of modality in text and identifying the sense
conveyed can be done by developing annotation schemes. There are works
that annotate these features, but they are not necessarily organized from a
linguistic perspective, which makes it difficult to summarize them and
separate out the relevant points of interest. A review of the literature
shows that various annotation schemes have been constructed over time for
marking different components.

Baker et al (2012) described a modality/negation (MN) annotation scheme
which isolates three components of modality and negation: a trigger (the
source of modality or negation), a target (the action associated with
modality or negation) and a holder (the experiencer of modality). Moreover,
they constructed an MN lexicon and two automated MN taggers using the
annotation scheme.
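
The three components can be pictured as a simple record. The sketch below
is only an illustration of the trigger/target/holder structure; the class
name MNAnnotation, the example sentence and its analysis are assumptions
made here and are not taken from Baker et al.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MNAnnotation:
    """One modality/negation annotation with the three components described."""
    trigger: str                  # source of the modality or negation
    target: str                   # action associated with the modality or negation
    holder: Optional[str] = None  # experiencer of the modality, if expressed


# Hypothetical analysis of the invented sentence "Sandy wants to go home."
example = MNAnnotation(trigger="wants", target="go home", holder="Sandy")

if __name__ == "__main__":
    print(example)
```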

Ruppenhofer and Rehbein (2012) present an annotation scheme that annotates
the type of English modals in the MPQA corpus. The modal verbs targeted
were can/could, may/might, must, ought, and shall/should. The annotation
involved categorizing the modals into six types: epistemic, deontic,
dynamic, optative, concessive, and conditional.

Pakray et al (2012) experimented on the QA4MRE data sets to identify
modality and negation in text and to assign the labels mod, neg, neg-mod
and none for occurrences of modality, negation, both modality and negation,
and absence of both, respectively. The detected modals were then
categorized as epistemic or deontic.
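
The four labels follow directly from two yes/no decisions. A minimal
sketch of that mapping is given below; the function name mn_label and its
boolean inputs are assumptions for illustration, and how the two decisions
are actually made is left open.

```python
def mn_label(has_modality: bool, has_negation: bool) -> str:
    """Map the presence of modality and negation to one of the four labels."""
    if has_modality and has_negation:
        return "neg-mod"
    if has_modality:
        return "mod"
    if has_negation:
        return "neg"
    return "none"


if __name__ == "__main__":
    for modality in (True, False):
        for negation in (True, False):
            print(modality, negation, mn_label(modality, negation))
```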

Hendrickx et al (2012) present a scheme for the annotation of modality in
Portuguese, using the MMAX2 tool. The components annotated were: trigger
(the element conveying the modal value), target (the expression in the
scope of the trigger), source of the event mention (speaker or writer), and
source of the modality (agent or experiencer). For the trigger, two
attributes were also specified: modal value and polarity. They stated
thirteen different types into which the modal value could be categorized.

Rubinstein et al (2013) proposed a fine-grained annotation approach using
the MPQA corpus and the MMAX2 tool. It was described as fine-grained
because it extends some of the previous works with a number of novel
features to improve the detection and interpretation of modals in text. The
features annotated were: modality type, polarity, propositional arguments,
source, background, modified element, degree indicator, outscoping
quantifier and lemma. Modality was categorized into fine-grained types
(epistemic, circumstantial, ability, deontic, bouletic, teleological, and
bouletic/teleological) and coarse-grained types (epistemic or
circumstantial, ability or circumstantial, and priority).

A survey of other studies that use annotations suggests that more and more
attributes were annotated in order to make text understanding precise.
Some works moved in the direction of subjectivity analysis and some towards
certainty analysis, whereas others focused on time, events and temporal
analysis; all of them are implicitly useful for understanding the type
and/or sense of modality.

Different types of subjectivity are implied in discourse by the different
types of modals, epistemic and deontic. The relationship between
subjectivity and modality is discussed at length in Sanders and Spooren
(1997).

Rubin et al. (2004; 2006) presented a certainty categorization model
based on four hypothesized dimensions and tested the model on a sample of
news articles. Rubin (2010) identifies that certainty can be seen as a
variety of epistemic modality expressed in the form of markers such as
probably, perhaps, undoubtedly, etc.

Nissim et al (2013) proposed a modality annotation model that they
describe as two-layered. Factuality and speaker's attitude are the two
components marked; they also plan to make the model more coherent by
annotating the strength of modality.

Matsuyoshi et al (2010) draw attention to the point that recent
developments in language processing have improved precision but remain
insufficient for applications such as information extraction, question
answering and recognizing textual entailment. Such applications require
further information such as modality, polarity, and other associated
information, collectively referred to as extended modality. Matsuyoshi et
al propose an annotation scheme that represents extended modality and
consists of seven components: source, time, conditional, primary modality
type, actuality, evaluation, and focus. Using this scheme, they also
constructed an annotated corpus in Japanese.

Emphasizing event modality, Saurí et al. (2006a) state that modality is
an important component of discourse, together with other levels of
information such as argument structure and temporal information. This
creates an apparent need for a more sophisticated approach that is
sensitive to such additional information. They worked on an annotation
scheme that annotates event modality and identifies its scope using TimeML.
Saurí (2006b) has also worked on SlinkET, an attempt to construct a modal
parser for events. Pustejovsky et al. (2003) built the TimeBank corpus,
which is annotated with information such as modals, events, times, and
relations between events and temporal expressions. Saurí and Pustejovsky
(2009) built FactBank on the basis of TimeBank; in it, events are assigned
different degrees of factuality according to their source-introducing
predicates (SIPs) and source. The degrees of factuality are determined
along two axes, certainty and polarity. The degrees of certainty are
certain, probable, possible and unknown, whereas the polarity axis contains
positive, negative and underspecified.
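
The two axes can be written down as small enumerations. The sketch below
only encodes the values listed above; the class names and the Factuality
record are assumptions for illustration, and the full cross product is
enumerated without any claim about which combinations FactBank actually
uses.

```python
from dataclasses import dataclass
from enum import Enum


class Certainty(Enum):
    CERTAIN = "certain"
    PROBABLE = "probable"
    POSSIBLE = "possible"
    UNKNOWN = "unknown"


class Polarity(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    UNDERSPECIFIED = "underspecified"


@dataclass(frozen=True)
class Factuality:
    """Factuality of one event as seen by one source (hypothetical record)."""
    certainty: Certainty
    polarity: Polarity
    source: str


# Every pairing of the two axes; not all of them need to occur in practice.
all_values = [(c, p) for c in Certainty for p in Polarity]

if __name__ == "__main__":
    print(len(all_values))
    print(Factuality(Certainty.PROBABLE, Polarity.POSITIVE, source="author"))
```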

3. Gap and Futurology 

Linguistic modality is one of the components of discourse that is
associated with context, sense and meaning, mental and real spaces, and
force dynamics.

There is also no well-defined classification of the different types and
senses of modality in the linguistic literature. This lack of a taxonomy
has created considerable confusion regarding types and senses. Given that
the subject is complex, that modality in discourse has many associated
aspects, and that its scope is wide, language scientists can also
contribute to it.

The first point to note is that different kinds of expressions can be used
to convey modal meanings, and we saw that the tagging approach is limited
to tagging the explicit use of modals in text. Moreover, taggers shed no
further light on the type and sense of the modality. Attempts have been
made at type and sense classification using annotation methods, but the
flexibility of meanings makes standardization difficult. Flexibility of
meaning here refers to the fact that a modal verb has different meanings
according to context. Modality in language has contextual meanings and
implied perceptions, and current mechanisms fail to identify perspective,
aspect and contextuality.

Another drawback is the dependency inherent in both methods, tagging as
well as annotating. The process of tagging is dependent on the tagset, and
the annotation scheme is limited by its own training corpus. Although this
lexicon dependency makes the available approaches useful for preliminary
passes of processing, it is not an effective way of understanding natural
language, which involves cognition and perception.

At this point, it seems appropriate to draw inspiration from human
information processing mechanisms. An understanding of the human cognitive
process can illuminate the path of development towards making our systems
artificially intelligent. Computational models of cognition, creative
insight, skill acquisition and the design of instructional software, as
well as other topics in higher cognition, need to be reconsidered. It is
noteworthy that human perceptive systems, whether visual or speech/audio,
are essentially layered and hierarchical in structure. It is therefore
natural to believe that the state of the art in processing natural language
can be advanced if a structurally efficient and effective learning model
can be developed. The layered structure of human learning shown in Kaplan &
Sadock's Comprehensive Textbook of Psychiatry (Sadock, 2000, fig. 24-1) is
instrumental in visualizing the complexity and multi-stage structure
involved. Moreover, the interconnection and synergy between the layers are
equally vital.

Real systems are dynamic in nature and undergo continuous shifts that
produce perceptual differences. If the shift is in the upward direction (in
a hierarchical multi-layered model), it results in high dimensionality in
the nature of expression. The expression thus becomes nonspecific and loses
subjectivity. Nonspecific discourse is too complex to interpret, and
reaching any conclusion from it is very difficult. Software dealing with
language processing appears to follow the direction and trends of
Artificial Intelligence (AI) and Robotics, which attempt to mimic the
dynamic and behavioral output of humans. In natural language processing,
with special reference to modality processing, noticeable development has
been observed for static forms of expression, and the expression of
contextuality has also been attempted within one document, as in MMAX2 from
a German NLP group.

Hidden-layer processing in neural networks and continuous modification
have been applied successfully in the field of machine learning. These
directions, explored under names such as deep learning and layered
approaches, are among the most popular in research on language processing.

Depending on the objectives of a system, some key features should be
considered. A relevant, well-structured knowledge base with an improved
feature space should be formed upon each processing pass. This knowledge
base must be dynamically updated (active learning, meaning continuous
updating of the knowledge base by each layer of the model) so that it can
be used effectively in applications of the system.

Machine learning has been a dominant tool in NLP for many years. However,
the use of traditional machine learning in NLP has been mostly limited to
numerical optimization of weights for human-designed representations and
features from the text data. The goal of representation learning is to
automatically develop features or representations from the raw text
material appropriate for a wide range of NLP tasks.

Deep learning has recently been gaining popularity because it provides
levels of abstraction. The multi-layered architecture formed by these
levels ensures a natural progression from low-level to high-level
structure, as seen in natural complexity. Deep learning works on the
principle of learning representations. The essence of deep learning is to
automate the process of discovering effective features or representations
for any machine learning task, including automatically transferring
knowledge from one task to another concurrently. With regard to NLP, deep
learning develops and makes use of an important concept called embedding,
which refers to the representation of symbolic information in natural
language text at word level, phrase level, and even sentence level in terms
of continuous-valued vectors.
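
The idea of embedding can be illustrated with a few hand-written vectors.
In the sketch below, the three-dimensional vectors are invented for
illustration and are not learned from any data; cosine similarity compares
two word vectors, and a phrase vector is formed by simple averaging, which
is one common and deliberately crude way to move from word level to phrase
level.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only (not trained).
EMBEDDINGS = {
    "must": [0.9, 0.1, 0.0],
    "should": [0.8, 0.2, 0.1],
    "home": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def phrase_vector(words):
    """Average the word vectors of a phrase, dimension by dimension."""
    vectors = [EMBEDDINGS[w] for w in words]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


if __name__ == "__main__":
    print(cosine(EMBEDDINGS["must"], EMBEDDINGS["should"]))
    print(cosine(EMBEDDINGS["must"], EMBEDDINGS["home"]))
    print(phrase_vector(["must", "home"]))
```

With these toy values the two modal verbs come out closer to each other
than either is to the noun, which is the behavior learned embeddings are
expected to show on real data.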

Another concept, multi-task learning, has also shown improvements in
learning approaches. Multi-task learning is a machine learning approach
that learns to solve several related problems at the same time, using a
shared representation. It can be regarded as one of the two major classes
of transfer learning, or learning with knowledge transfer, which focuses on
generalizations across distributions, domains, or tasks. The other major
class of transfer learning is adaptive learning, where knowledge transfer
is carried out in a sequential manner, typically from a source task to a
target task.

3.1. Recent works with deep architectures in NLP 

A variety of deep architectures, such as neural networks and deep belief
networks, have shown significant performance in various applications of
language processing as well as in other fields.

Collobert et al (2011) provide a comprehensive review of ways of applying
unified neural network architectures and related deep learning algorithms
to solve NLP problems from scratch, meaning that no traditional NLP methods
are used to extract features. The recent work by Mikolov et al (2013a)
derives word embeddings by simplifying the Neural Network Language Model
(NNLM). It is found that the NNLM can be successfully trained in two
steps. Yet another deep learning approach, applied to machine translation,
appeared in Mikolov et al (2013b).

One of the most interesting NLP tasks recently tackled by deep learning
methods is knowledge base (ontology) completion, which is instrumental in
question answering and many other NLP applications. An early work in this
space came from Bordes et al (2011), who introduced a process to
automatically learn structured distributed embeddings of knowledge bases.
The proposed representations in the continuous-valued vector space are
compact and can be efficiently learned from large-scale data of entities
and relations. A specialized neural network architecture is used. In the
follow-up work that focuses on multi-relational data (Bordes et al, 2014),
the semantic matching energy model is proposed to learn vector
representations for both entities and relations.

Other recent works, Socher et al (2013) and Bowman (2013), adopt an
approach based on neural tensor networks to attack the problem of reasoning
over a large joint knowledge graph for relation classification. The
knowledge graph is represented as triples of a relation between two
entities, and the authors aim to develop a neural network model suitable
for inference over such relationships. The model they present is a neural
tensor network with only one layer; further work on multi-layered network
models is to be encouraged. The network represents entities as
fixed-dimensional vectors, which are created separately by averaging
pre-trained word embedding vectors. It then learns the tensor with the
newly added relationship element that describes the interactions among all
the latent components in each of the relationships. Experimentally, Socher
et al. show that this tensor model can effectively classify unseen
relationships in WordNet and FreeBase. Thus, models built on tensors can
contribute, to a certain extent, to reasoning over relationships between
entities, enhancing knowledge bases for improved performance. Work using
Recursive Neural Networks (RNN) for syntactic parsing and word
representations has been carried out in Luong et al (2013) and Socher et al
(2010). Deep neural networks are popular and perform well because they are
intrinsically multi-layered in structure.
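
As a rough illustration of how a single-layer neural tensor network scores
a triple, the sketch below follows the form usually given for this model:
a bilinear tensor term over the two entity vectors, plus a standard linear
term over their concatenation and a bias, passed through tanh and combined
by an output vector. This is a sketch under stated assumptions rather than
the authors' implementation: the function name ntn_score is hypothetical,
all parameter values are made up, and nothing is trained.

```python
import math


def ntn_score(e1, e2, W, V, b, u):
    """Score one (e1, relation, e2) triple for a single relation.

    e1, e2: entity vectors of length d.
    W: list of k slices, each a d x d matrix (the relation's tensor).
    V: k x 2d matrix applied to the concatenation [e1; e2].
    b, u: bias and output vectors of length k.
    """
    concat = list(e1) + list(e2)
    score = 0.0
    for k in range(len(u)):
        # Bilinear interaction of the two entities through tensor slice k.
        bilinear = sum(
            e1[i] * W[k][i][j] * e2[j]
            for i in range(len(e1))
            for j in range(len(e2))
        )
        # Standard feed-forward term on the concatenated entity vectors.
        linear = sum(V[k][m] * concat[m] for m in range(len(concat)))
        score += u[k] * math.tanh(bilinear + linear + b[k])
    return score


if __name__ == "__main__":
    # Made-up parameters: d = 2, k = 1, identity slice, no linear term.
    W = [[[1.0, 0.0], [0.0, 1.0]]]
    V = [[0.0, 0.0, 0.0, 0.0]]
    b = [0.0]
    u = [1.0]
    print(ntn_score([1.0, 0.0], [1.0, 0.0], W, V, b, u))
    print(ntn_score([1.0, 0.0], [0.0, 1.0], W, V, b, u))
```

With the identity slice used here, the bilinear term reduces to a dot
product, so aligned entity vectors score higher than orthogonal ones.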

Deep learning is an active area of research with much potential for
significant advances. The paradigm of deep learning architectures can
improve the results of our models to a certain extent, but there is a limit
to this when the whole problem statement is considered, because deep neural
networks are still a kind of black-box model in terms of functionality.
Revising models and designs can surely enhance the performance of deep
learning algorithms for a specific application domain, but the possible
improvements are bounded, and the available approaches would not provide
sufficient means to reach the desired level of Artificial Intelligence.
Approaches that are multi-layered, and preferably white-box models, would
be essential for organizing and controlling the functionality of the
intermediate layers.

As mentioned above, for NLP and for any other application that deals with
the dynamics of real-world conditions, it will be helpful to develop
complex systems by combining several simple, modular, multi-layered
hierarchical architectures that have the key features described. Hopefully
such models can attain better accuracy in linguistic understanding, which
would contribute to the expertise of the application.